Intact predictive processing in autistic adults: evidence from statistical learning

Impairment in predictive processes has gained a lot of attention in recent years as an explanation for autistic symptoms. However, empirical evidence does not always support this framework. Thus, it is unclear which aspects of predictive processing are affected in autism spectrum disorder. In this study, we tested autistic adults on a task in which participants acquire probability-based regularities (that is, a statistical learning task). Twenty neurotypical and 22 autistic adults learned a probabilistic, temporally distributed regularity for about 40 min. Using frequentist and Bayesian methods, we found that autistic adults performed comparably to neurotypical adults, and the dynamics of learning did not differ between groups either. Thus, our study provides evidence for intact statistical learning in autistic adults. Furthermore, we discuss potential ways this result can extend the scope of the predictive processing framework, noting that atypical processing might not always mean a deficit in performance.

In recent years, several frameworks have emerged to explain the neurocognitive mechanisms behind autism spectrum disorder (ASD). A line of research suggests that autistic behavior might emerge due to an atypical ability to predict future events based on experience and current sensory input, that is, predictive processing. The predictive processing framework originated from perception research: according to it, the brain generates hypotheses about the environment during perception based on previous experiences (priors) and updates the hypotheses using the prediction errors, that is, the differences between the predictions and the actual sensory inputs 1 . This framework has since been extended to a general framework for understanding brain functioning, including learning and memory 2,3 . It might benefit the understanding of ASD, and thus, help develop better supporting systems and interventions.
Various approaches to the predictive processing framework offer explanations for autistic traits by highlighting atypicalities in different components of the process. One of them assumes that autistic individuals tend to attribute a high and inflexible precision to prediction errors 3 . According to this view, autistic people would systematically adjust their internal representation of the world after each (minor) prediction error, instead of considering that some of these errors might simply signal unavoidable noise. Importantly, such errors indicate to the learner that the regularity is not fully learned yet 3,4 . Another viewpoint proposes that ASD individuals rely more on incoming sensory data (i.e., bottom-up information) than on their prior experiences (i.e., top-down processes), which may result in less adaptive behaviour 4-8 . Lastly, in ASD, atypical predictive processing may arise from an inaccurate estimation of the extent to which environmental regularities change (as opposed to the estimation of the noise in the regularity itself, as mentioned above; see 9 for different types of uncertainties), that is, the estimation of volatility 10 . Autistic people tend to overestimate volatility, even at the expense of learning environmental probabilities 8 . Altered or impaired predictive processes could explain sensory hypersensitivity 3,11 , deficits in sociocognitive skills 12 , and rigid habit-like behaviour in ASD (e.g., 3,4,10 ). Despite its potential as a

Methods
Participants. In total, 45 participants were recruited for the study. Three neurotypical participants were excluded from the analysis due to errors in the data collection. Thus, the data of 42 participants were entered into the analyses: 20 of them were neurotypical, and 22 had a diagnosis of ASD. Neurotypical participants were screened for diagnoses of any psychiatric or neurological disorders, and none of them scored higher than 27 on the autism spectrum quotient (AQ) questionnaire, which means that they did not tend to show autistic behavioral patterns 49 . ASD diagnoses were provided by trained clinicians; both childhood scores of the autism diagnostic interview-revised (ADI-R) and the autism diagnostic observation schedule, IV-module (ADOS-IV) 50,51 confirmed the diagnosis. We screened ASD participants for comorbid disorders. Twelve of them had at least one of the following: attention deficit hyperactivity disorder (5), obsessive-compulsive disorder (3), generalized anxiety disorder (2), bipolar disorder (1), depression (1), and schizophrenia (1). Having an intellectual disability, language impairment, or active psychosis were exclusion criteria. Neurotypical participants were recruited by advertisement, while participants with ASD were recruited from the outpatient unit of the Department of Psychiatry and Psychotherapy, Semmelweis University. No participant received financial compensation for their participation.
The two groups did not differ in age, gender distribution, or years of education (see Table 1). All participants provided written informed consent. The study was conducted in accordance with the Declaration of Helsinki of 1975, as revised in 2008, and it was approved by the Regional and Institutional Committee of Science and Research Ethics, Semmelweis University, Budapest, Hungary (SERKEB No.: 145/2019). The experiment took place at the Laboratory of Brain, Memory and Language Lab, Eötvös Loránd University, Budapest.
Task and procedure. To measure statistical learning, we applied the ASRT task 18 , a commonly used and highly reliable task (e.g. 52 ). In this task, participants saw four empty circles on a white background, horizontally arranged on the screen. A target stimulus (a dog's head) appeared in one of the four locations. Participants were asked to press the button corresponding to the location of the appearing stimulus (the Y, C, B, and M keys of a QWERTZ keyboard corresponded to the first, second, third, and fourth circle from left to right, respectively), using their right and left index and middle fingers. Participants were told that the goal of the task was to be as fast and as accurate as possible. Unknown to them, however, the serial order of the stimulus locations followed a specific structure: every second stimulus appeared randomly in one of the four possible locations, whereas the stimuli in between (the first, third, fifth, and so on) followed a fixed, repeating order. Thus, these alternating elements formed an eight-element probabilistic sequence (e.g., 1r2r4r3r, where the numbers indicate the location of the elements belonging to the pattern, and r indicates a random position out of the four, see Fig. 1). Due to this structure, some combinations of three consecutive trials (triplets) were more likely to occur than others. In the above example, 1 x 2, 2 x 4, 4 x 3, and 3 x 1 are high-probability triplets (where "x" indicates the middle element of the triplet, regardless of whether it is random or belongs to the pattern): they can be formed either by two pattern elements enclosing a random one (PrP) or by two random elements enclosing a pattern one (rPr). Out of the total of 64 possible triplets, 16 were high-probability triplets. Any other triplet (such as 1 x 3 or 2 x 1) cannot be formed by two pattern elements enclosing a random one; thus, they occurred with low probability.
Importantly, if participants respond with shorter reaction times (RT) and higher accuracy to the last element of a high-probability triplet (e.g., 2 in the above-mentioned 1 x 2 triplet) than to the last element of a low-probability triplet (e.g., 3 in the above-mentioned 1 x 3 triplet), it means that they learned to predict the former based on the preceding two elements and thus acquired the underlying probability structure of the task. There were 48 low-probability triplets in this task. This task structure resulted in the following statistical structure: 50% of all trials were the last element of a high-probability triplet formed by two pattern elements and one random element (pattern-random-pattern), and 12.5% of all trials were the last element of a random-ending high-probability triplet (random-pattern-random). Therefore, high-probability triplets occurred with an overall probability of 62.5%, while low-probability triplets occurred with an overall probability of 37.5%. At the level of unique triplets, each high-probability triplet occurred with a probability of about 4% (62.5%/16), while each low-probability triplet occurred with a probability of about 0.8% (37.5%/48). As the last element of a high-probability triplet was more predictable than that of a low-probability triplet, we defined statistical learning as the difference in RT and accuracy between these triplet types. For further details of the ASRT task structure, see Fig. 1.
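The percentages above can be re-derived by enumeration. The following is a minimal sketch, not the authors' code: it assumes the example pattern 1r2r4r3r and uniformly random "r" elements, and all variable and function names are illustrative.

```python
from fractions import Fraction
from itertools import product

PATTERN = (1, 2, 4, 3)    # example pattern from the text (1r2r4r3r)
LOCATIONS = (1, 2, 3, 4)  # the four stimulus locations

# Successor of each pattern element in the repeating pattern: 1->2, 2->4, 4->3, 3->1.
successor = {p: PATTERN[(i + 1) % len(PATTERN)] for i, p in enumerate(PATTERN)}


def is_high(first, last):
    """A triplet first-x-last is high-probability if `last` follows `first` in the pattern."""
    return successor[first] == last


# Count the unique triplets of each type among all 4 * 4 * 4 = 64 combinations.
triplets = list(product(LOCATIONS, repeat=3))
n_high = sum(is_high(a, c) for a, _, c in triplets)  # 16
n_low = len(triplets) - n_high                       # 48

# Half of all trials are pattern trials; they always end a pattern-random-pattern
# triplet, which is high-probability by construction.
p_prp = Fraction(1, 2)

# The other half are random trials ending a random-pattern-random triplet. Such a
# triplet is high-probability only if the two random elements happen to form a
# pattern pair, which holds for 4 of the 16 equally likely (first, last) pairs.
n_matching_pairs = sum(is_high(a, c) for a, c in product(LOCATIONS, repeat=2))
p_rpr_high = Fraction(1, 2) * Fraction(n_matching_pairs, len(LOCATIONS) ** 2)

p_high = p_prp + p_rpr_high  # 5/8 = 62.5%
p_low = 1 - p_high           # 3/8 = 37.5%

print(float(p_high / n_high))  # 0.0390625 (about 4% per unique high-probability triplet)
print(float(p_low / n_low))    # 0.0078125 (about 0.8% per unique low-probability triplet)
```

Under these assumptions, each unique high-probability triplet is five times as likely as each unique low-probability one, which is the contrast the learning measure relies on.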
The task was divided into 40 blocks in total. Each block contained 85 trials: five random elements at the beginning (these were later excluded from the analysis), followed by ten repetitions of the eight-element alternating sequence described above. The task was self-paced: the target stimulus remained on the screen until the first correct response, and the response-stimulus interval (RSI) was 120 ms, during which participants saw the four empty circles. Between blocks, participants received feedback on their RT and accuracy and could take a short rest. To reduce noise due to intra-individual variability in the analysis, we merged five blocks into one unit of analysis called an epoch.
To familiarize the participants with the ASRT task and to make sure they understood the instructions, participants first performed two blocks without the pattern (that is, all trials were random). After that, participants were asked to perform 8 epochs, with a ~15-min-long break after the 4th epoch. Although the ASRT task has been shown to be truly implicit (that is, no conscious knowledge is formed regarding the regularities hidden in the task, see 53 ), once the ASRT was over, we administered a short questionnaire to make sure that none of the participants gained explicit knowledge of the structure of the task. It consisted of two questions increasingly specific to the nature of the structure: "Have you noticed anything special regarding the task?", and "Have you noticed some regularity in the sequence of stimuli?". According to this questionnaire, none of our participants gained conscious knowledge of the regularity.
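The block structure described above can be sketched as follows. This is an illustrative reconstruction, not the experiment software: the function names, the trial labels, and the assumption that epochs consist of consecutive blocks numbered from 1 are ours.

```python
import random


def make_asrt_block(pattern=(1, 2, 4, 3), n_lead_in=5, n_repeats=10, rng=None):
    """Return one block as a list of (location, kind) tuples.

    A block is `n_lead_in` random trials followed by `n_repeats` repetitions of the
    eight-element alternating sequence (pattern, random, pattern, random, ...).
    """
    rng = rng or random.Random()
    locations = (1, 2, 3, 4)
    trials = [(rng.choice(locations), "lead-in") for _ in range(n_lead_in)]
    for _ in range(n_repeats):
        for p in pattern:
            trials.append((p, "pattern"))                     # fixed, repeating order
            trials.append((rng.choice(locations), "random"))  # uniformly random location
    return trials


def epoch_of_block(block_number, blocks_per_epoch=5):
    """Map a 1-based block number to a 1-based epoch number (assumes consecutive blocks)."""
    return (block_number - 1) // blocks_per_epoch + 1


block = make_asrt_block(rng=random.Random(0))
print(len(block))          # 85 trials: 5 lead-in + 10 * 8
print(epoch_of_block(40))  # 8: forty blocks yield eight epochs
```

Passing a seeded `random.Random` makes a generated block reproducible, which is convenient when checking the trial counts.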

Statistical analysis.
Statistical analyses were carried out using JASP 0.16.1.0 54 , and data preparation and visualization were conducted in Python 3.8 using the pandas, NumPy, os, matplotlib, and seaborn packages 55-57 . First, using a sliding window, we determined for each trial whether, given the two elements preceding it, it was the last element of a high- or a low-probability triplet (for the sake of simplicity, such trials are henceforth referred to as high-probability and low-probability triplets). That is, considering the example in Fig. 1, if the stimuli followed the "13214232" order, trial "2" (the third stimulus) was first categorized as the last element of a high-probability triplet (1 3 2). Then, trial "1" was likewise categorized as the last element of a high-probability triplet (3 2 1), and so on. After this categorization, we excluded the last elements of trills (e.g., 2 1 2) and repetitions (e.g., 2 2 2), since participants show a pre-existing tendency to react faster to these elements, which can bias the RTs 58 . We also screened for outlier trials using a boxplot, meaning that we excluded all trials where the RT was more than 1.5 inter-quartile distances (IQD) below the first quartile or more than 1.5 IQD above the third quartile. With this method, we excluded 5.83% of all trials in the entire sample (5.46% in the neurotypical, and 6.17% in the ASD group). Using the remaining data, we calculated the mean accuracy and median RT in each epoch, separately for high- and low-probability triplets. On these data, we performed a mixed-design ANOVA described in the Results section. When applicable, pairwise comparisons were performed using Holm correction.
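The preprocessing steps described above can be sketched as follows. This is a standard-library illustration, not the authors' pandas pipeline. It assumes the example pattern 1r2r4r3r, and the quartile method of `statistics.quantiles` and the grouping over which the boxplot rule is applied are assumptions, since the text does not specify them. All function names are hypothetical.

```python
from statistics import median, quantiles


def classify_trials(stimuli, pattern=(1, 2, 4, 3)):
    """Label each trial by the triplet it completes, using a sliding window of three."""
    successor = {p: pattern[(i + 1) % len(pattern)] for i, p in enumerate(pattern)}
    labels = [None, None]  # the first two trials do not complete a triplet
    for i in range(2, len(stimuli)):
        first, last = stimuli[i - 2], stimuli[i]
        if first == last:
            # Covers both trills (e.g., 2 1 2) and repetitions (e.g., 2 2 2).
            labels.append("excluded")
        elif successor[first] == last:
            labels.append("high")
        else:
            labels.append("low")
    return labels


def boxplot_filter(rts, k=1.5):
    """Keep RTs within [Q1 - k * IQD, Q3 + k * IQD]."""
    q1, _, q3 = quantiles(rts, n=4)  # quartile method is an assumption
    iqd = q3 - q1
    return [rt for rt in rts if q1 - k * iqd <= rt <= q3 + k * iqd]


def learning_score(high_rts, low_rts):
    """Median RT to low-probability minus median RT to high-probability triplets (ms)."""
    return median(low_rts) - median(high_rts)


# The example order from the text: the third trial ("2") and the fourth trial ("1")
# both complete high-probability triplets; the last trial completes a trill (2 3 2).
print(classify_trials([1, 3, 2, 1, 4, 2, 3, 2]))
# [None, None, 'high', 'high', 'high', 'high', 'high', 'excluded']
```

A positive `learning_score` means faster responses to high-probability triplets, which is the direction of the effect the paper treats as evidence of learning.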
In addition to the frequentist statistics, we performed Bayesian analyses using default JASP priors, to be able to detect null results. Based on the BF01 values (which indicate the ratio of the likelihood of the data under the null hypothesis to their likelihood under the alternative hypothesis), we calculated Bayes Factor exclusion (BFexcl) values. We compared the models to the null model (which included the subject variable and random slopes) in each case, and we calculated BFexcl values across matched models. BFexcl values indicate how likely a model that does not include the given effect is relative to one that does. BFexcl values above one support the exclusion of the given factor from the model, while values below one support its inclusion 59 . Values close to one mean that there is not enough evidence to support either inclusion or exclusion. We suggest interpreting these values similarly to BF01 scores: a score above 3 indicates substantial evidence in favour of the null hypothesis, a score between 0.33 and 1 indicates anecdotal evidence, and a score below 0.33 indicates substantial evidence in favour of the alternative hypothesis 60,61 . For the sake of transparency, however, we reported BF01 values and errors (%) in Supplementary Materials S1 Table. The data are available at https://osf.io/mebcx/.
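The interpretation thresholds described above can be written as a small helper. This is only an illustrative sketch: the function name and label strings are ours, and the band between 1 and 3, which the text does not name, is labelled "anecdotal" by symmetry as an explicit assumption.

```python
def interpret_bf(bf, upper=3.0, lower=0.33):
    """Map a BF01 or BFexcl value to a verbal evidence category."""
    if bf <= 0:
        raise ValueError("a Bayes factor must be positive")
    if bf > upper:
        return "substantial evidence for the null (exclusion)"
    if bf < lower:
        return "substantial evidence for the alternative (inclusion)"
    if bf > 1:
        # Assumption: this band is not named in the text; labelled by symmetry.
        return "anecdotal evidence for the null (exclusion)"
    if bf < 1:
        return "anecdotal evidence for the alternative (inclusion)"
    return "no evidence either way"


print(interpret_bf(4.2))  # substantial evidence for the null (exclusion)
print(interpret_bf(0.5))  # anecdotal evidence for the alternative (inclusion)
```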
Significance statement. According to the predictive processing framework, autistic symptoms are the result of a weak ability to predict future events based on prior knowledge and sensory input. Despite its popularity, the validity of this framework and its limitations are still unclear. Here, we aim to test the predictive processing framework in autism by using a temporal statistical learning task. We found intact predictive processing in autism: neither the amount of learning nor its dynamics were altered. Our result challenges the predictive processing framework of autism. However, we suggest an update of the framework to better explain existing data and deepen our understanding of autism.

Discussion
In this study, we aimed to test the statistical learning of autistic adults in light of the predictive processing framework. Besides overall statistical learning, we also tested the dynamics of the learning process, which, to the best of our knowledge, has not been addressed in autistic adults before. We also performed exploratory analyses to find individual differences related to autistic symptom severity, which are reported in the Supplementary Materials (see Supplementary Information 1 Figure S3). Our findings provide frequentist and some Bayesian evidence of intact learning performance and similar learning curves in ASD and neurotypical participants.

These results seemingly contradict both the predictive processing framework of ASD, which suggests impaired statistical learning in ASD 2,13 , and the empirical findings of Roser et al. 43 , who found superior statistical learning in ASD. On the other hand, they are in line with previous literature that found no impairment in probabilistic statistical learning tasks in autistic children 36,40-42 . These contradictions highlight the possibility that predictive processing in autism might depend on the task used and that some aspects of it may be intact in ASD, which has both theoretical and clinical importance. In the following paragraphs, we will discuss possible explanations for these inconsistencies. First, the general information processing style the task requires might play a role. Second, atypicalities in different components of predictive processing could provide an explanation. As mentioned in the Introduction, the predictive processing framework of ASD is not a monolithic concept but rather an umbrella term that includes different mechanisms that could explain autistic traits/symptoms; these mechanisms are not necessarily mutually exclusive, yet each offers a different angle from which to interpret the results. We did not directly assess these mechanisms in our study; moreover, all these approaches are challenged by contradicting empirical results 20,62-64 . Thus, future studies on them are warranted, yet they may still help us understand our results in the context of the predictive processing framework and provide future directions. Lastly, we will discuss the potential role of age in statistical learning.
Based on the work of Roser et al. 43 , we even expected superior statistical learning performance in ASD compared to neurotypical adults, but we could not replicate their results. An important difference between their task and ours was that their visual statistical learning task presented the learnable regularities on the same slide (that is, they were spatially distributed), whereas in our ASRT task, the learnable regularities were distributed in time (that is, temporally distributed). This leads to an important difference that might explain the contradictory results: the local- versus global-level processing involved in these tasks. Roser and colleagues 43 attributed their findings to the significant engagement of local processing, a cognitive style in which autistic individuals often excel compared to neurotypical peers ( 44 , but again, see 46 for contradicting evidence). It is likely that our task, in comparison with the spatially distributed one used by Roser et al. 43 , requires more global-level integration: if participants fail to integrate the elements that successively occur, their statistical learning might be weaker. Although we acknowledge that acquiring spatially distributed regularities requires global-level integration as well, autistic individuals seem to benefit from a relative predominance of local-level processing 45 . Thus, the difference between our and Roser and colleagues' 43 results may not derive from statistical learning at all, but from the atypicality of local/global processing. Besides the general information processing style, atypically high and inflexible precision of prediction errors in ASD 3 could account for the benefit of probabilistic tasks compared to deterministic ones. Such errors lead autistic people to update the model after each error, rather than attributing the errors to the unavoidable imprecision of the prediction itself.
This has an important implication regarding our probabilistic statistical learning task: the constant update of the model might be adaptive in a task where the regularity cannot be fully learned due to its probabilistic nature. Thus, the constant update based on the prediction errors might lead to a longer learning process: the learning curve of neurotypical participants might peak sooner, as they do not update their model after a certain point, attributing the prediction errors to the imprecision of the otherwise correct model. Meanwhile, ASD participants might keep updating, thus, learning (see also the work of Gazzaniga 65 about frequency-maximizing and frequency-matching strategies). This idea highlights the possibility that autistic predictive processing might depend on the given task type.
Figure caption (beginning truncated): The brown color indicates the RT of high-probability triplets, and the green color the RT of low-probability triplets. The gap between these two lines indicates the magnitude of statistical learning. We found no significant differences between the groups. The dashed line indicates a 15-min long break. Error bands indicate the SEM. (B) Statistical learning score on RT, in the neurotypical (left figure) and ASD (right figure) groups, by the epochs. Learning scores indicate the RT differences between high- and low-probability triplets, i.e., show how many ms faster participants reacted to the high-probability vs. the low-probability triplets. The blue lines indicate the mean performance of the given group, and the gray lines represent the learning score of individual participants. The dashed line indicates a 15-min long break. We found no significant differences between the groups. Error bands indicate the standard error of the mean in the group.
Yet, this topic needs further investigation, as some empirical evidence does not even support the different weighting of prediction errors in ASD (see 39 ), and studies have suggested that some statistical learning tasks are not error-driven 66,67 . The constant-updating account also implies that task length might affect ASD participants differently than neurotypical participants. Namely, neurotypical participants might outperform ASD participants on shorter tasks, but given enough time, ASD participants can catch up with, or maybe even exceed, the performance of neurotypical ones. Empirical evidence indeed supports this idea. Autistic participants tend to differ from neurotypical ones only in early learning 68 . Although they draw on prior knowledge less than neurotypical individuals, their priors are dominated by longer-term statistics of preceding stimuli, rather than recent ones 69-71 . Perhaps as a consequence of the above, they can catch up with 72 or even outperform their neurotypical peers by the end of the task ( 42 ; note, however, that this difference was only trend-level). Given enough time to learn, the constant updating of the representations might be adaptive in statistical learning. Another potential explanation is that, according to meta-analytic evidence, the overall global/local processing is similar in the autistic and neurotypical groups, but autistic people need more time for global processing than neurotypical people 46 , which might influence learning processes that require global processing. The slower learning dynamics might be an important methodological consideration, as most SRT/ASRT studies in which ASD participants performed well used longer (> 15 min long) learning sessions 40-42 , and our study, with about 40 min of practice, provides another example of this. Taken together, the predictive processing of the autistic brain might lead to intact (or, if supported by local processing, even superior) performance in the case of probabilistic regularities.
However, future studies should address this question before firm conclusions can be drawn.
Atypical use of prior knowledge (vs. relying primarily on sensory input) in ASD might be another way to explain the results. Although empirical evidence often does not support the view that autistic individuals apply weak priors (e.g., 62-64 ; for review, see 6,7 ), this view might help to understand our results. Performance on probabilistic tasks might benefit more from bottom-up than top-down processes: one has to rely on bottom-up processes, as prior knowledge cannot predict the next event with 100% probability. Thus, performance on the ASRT task potentially benefits more from bottom-up processes 26 , while using priors might even hinder it. To take a real-life example, learning the grammar of a foreign language can be harder if we are already proficient in another language: the regularities we learned before in another language can automatically come to mind instead of the correct grammar. In conclusion, while attributing lower weight to priors might harm performance on some predictive processing tasks, performance on complex probabilistic tasks can even benefit from it.
A growing body of literature aims to capture another type of uncertainty in the prediction process. According to Palmer et al. 10 and Lawson et al. 8 , autistic people in fact struggle with the estimation of volatility, rather than with the estimation of the noise inherently present even when the regularity remains the same. Overestimating volatility leads to an aberrant learning process, which adds to the interpretation of our current results: although the ASRT task involves some uncertainty (in that it is probabilistic), it is not volatile at all, which might explain the intact performance. This issue could be better understood by adding volatility to the ASRT task, for example by switching between different sequences to learn (see for example 73,74 ). Such a study would provide insight into how different types of uncertainties affect learning in ASD. Moreover, using computational models such as the hierarchical Gaussian filter would enable us to track the learning of volatility individually, cf. Lawson et al. 8 .
Given that volatility appears to offer an excellent explanation for our results, it would be particularly worthwhile for future studies to explore this concept.
However, statistical learning studies have only ever found an impairment in autistic children, not in adults. Moreover, all the previous studies that used our task showed no statistical learning impairment in autistic children 41,42 , which is in line with our findings in adults. All the studies to date, however, compared autistic individuals to neurotypical peers; to our knowledge, no study to date has compared the statistical learning performance of autistic children with that of autistic adults, even though it might be of relevance, as statistical learning tends to change over the lifespan: neurotypical children can outperform adults on probabilistic tasks 19,47 . Most empirical evidence, including the current paper, suggests similar statistical learning throughout the lifespan in autistic and neurotypical individuals. On the other hand, the nature of the task (e.g., probabilistic/deterministic) might affect this as well, as results on the SRT task in neurotypical children show a different developmental curve than those on the ASRT task 47,48,75 ; moreover, several functions show an altered developmental curve in ASD (see 7 for review). Thus, we need further empirical evidence that directly tests this question.
Taken together, our paper aimed to investigate statistical learning in autistic adults from the predictive processing point of view. Predicting probabilistic, temporally distributed regularities seems to be intact, but not superior, in ASD. This raises the possibility that predictive processing in ASD, even if it is atypical, can result in intact performance. Importantly, atypicality might affect performance differently in seemingly similar tasks; here, we discussed how certain factors may contribute to predictive processing in ASD. We would like to inspire future studies not to consider predictive processing as a monolithic concept: for example, the same mechanisms might impair performance in a deterministic task but not in a complex, probabilistic one. Furthermore, our findings might be useful for clinicians too; we suggest using strength-based methods in the therapy and education of ASD patients, e.g., using probabilistic methods or giving enough time. These suggestions might help us understand more about autistic predictive processing and help autistic individuals reach their full potential.